Introspective Agents: Confidence Measures for General Value Functions

Authors

  • Craig Sherstan
  • Adam White
  • Marlos C. Machado
  • Patrick M. Pilarski
Abstract

Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions. While such adaptive agents may leverage engineered knowledge, they will require the capacity to construct and evaluate knowledge themselves from their own experience in a bottom-up, constructivist fashion. This position paper builds on the idea of encoding knowledge as temporally extended predictions through the use of general value functions. Prior work has focused on learning predictions about externally derived signals about a task or environment (e.g. battery level, joint position). Here we advocate that the agent should also predict internally generated signals regarding its own learning process—for example, an agent’s confidence in its learned predictions. Finally, we suggest how such information would be beneficial in creating an introspective agent that is able to learn to make good decisions in a complex, changing world.

Predictive Knowledge. The ability to autonomously construct knowledge directly from experience produced by an agent interacting with the world is a key requirement for general intelligence. One particularly promising form of knowledge that is grounded in experience is predictive knowledge—here defined as a collection of multi-step predictions about observable outcomes that are contingent on different ways of behaving. Much like scientific knowledge, predictive knowledge can be maintained and updated by making a prediction, executing a procedure, observing the outcome, and updating the prediction—a process completely independent of human intervention. Experience-grounded predictions are a powerful resource to guide decision making in environments which are too complex or dynamic to be exhaustively anticipated by an engineer [1,2]. A value function from the field of reinforcement learning is one way of representing predictive knowledge.
Value functions are a learned or computed mapping from state to the long-term expectation of future reward. Sutton et al. recently introduced a generalization of value functions that makes it possible to specify general predictive questions [1]. These general value functions (GVFs) specify a prediction target as the expected discounted sum of future signals of interest (cumulants) observed while the agent selects actions according to some decision-making policy. Temporal discounting is also generalized in GVFs from the conventional exponential weighting of future cumulants to an arbitrary, state-conditional weighting of future cumulants. This enables GVFs to specify a rich ...

arXiv:1606.05593v1 [cs.AI] 17 Jun 2016
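The GVF described above can be sketched as a tabular TD(0) learner in which the reward is replaced by an arbitrary cumulant and the discount becomes a state-conditional continuation function. Everything concrete here (the chain environment, the particular cumulant, the gamma schedule, the drift policy) is an illustrative assumption, not from the paper:

```python
# Minimal sketch of a general value function (GVF) learned with TD(0).
# The environment (a 10-state chain), cumulant, continuation function, and
# policy are all illustrative assumptions.
import random

random.seed(0)

N_STATES = 10          # states 0..9 on a simple chain
values = [0.0] * N_STATES
alpha = 0.1            # step size

def cumulant(s):
    """Signal of interest; here, 1.0 on reaching the end of the chain."""
    return 1.0 if s == N_STATES - 1 else 0.0

def gamma(s):
    """State-conditional continuation: the prediction terminates at the end."""
    return 0.0 if s == N_STATES - 1 else 0.9

def step(s):
    """Fixed policy: drift right with probability 0.9."""
    return min(s + 1, N_STATES - 1) if random.random() < 0.9 else max(s - 1, 0)

for episode in range(2000):
    s = 0
    while True:
        s_next = step(s)
        # TD(0) update toward the GVF target: cumulant + gamma(s') * V(s')
        td_error = cumulant(s_next) + gamma(s_next) * values[s_next] - values[s]
        values[s] += alpha * td_error
        if gamma(s_next) == 0.0:
            break
        s = s_next
```

After learning, `values[s]` approximates the expected discounted sum of future cumulants from state `s`; swapping in a different cumulant or continuation function poses a different predictive question with the same update rule.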


Related articles

A Computational Theory of Belief Introspection

Introspection is a general term covering the ability of an agent to reflect upon the workings of his own cognitive functions. In this paper we will be concerned with developing an explanatory theory of a particular type of introspection: a robot agent's knowledge of his own beliefs. The development is both descriptive, in the sense of being able to capture introspective behavior as it exists...


Homogeneous Networks of Non-Introspective Agents Under External Disturbances - H∞ Almost Synchronization

This paper addresses the problem of “H∞ almost synchronization” for homogeneous networks of general linear agents subject to external disturbances and under directional communication links. Agents are presumed to be non-introspective; i.e. agents are not aware of their own states or outputs, and the only available information for each agent is a network measurement that is a linear combination ...


Logical Theories for Agent Introspection

Thomas Bolander: Logical Theories for Introspective Agents Artificial intelligence systems (agents) generally have models of the environments they inhabit which they use for representing facts, for reasoning about these facts and for planning actions. Much intelligent behaviour seems to involve an ability to model not only one’s external environment but also oneself and one’s own reasoning. We ...


ℋ∞ almost synchronization for homogeneous networks of non-introspective SISO agents under external disturbances

This paper addresses the problem of “H∞ almost synchronization” for networks of identical, linear agents under directed communication topologies. Agents are presumed to be non-introspective; i.e. agents are not aware of their own state or output, and every agent is only provided with a linear combination of its own output relative to that of the neighbors. Providing the solvability conditions, ...


ℋ∞ almost synchronization for non-identical introspective multi-agent systems under external disturbances

In this paper, synchronization for multi-agent systems subject to external disturbances is studied, and the notion of “H∞ almost synchronization” is introduced. The objective is to suppress the impact of disturbances on the synchronization error dynamics, in terms of the H∞ norm of the corresponding closed-loop transfer function, to any arbitrarily small value. We focus on networks of non-identical lin...



Journal title:

Volume   Issue

Pages  -

Publication year: 2016